This handbook is a complete, implementation-level walkthrough of ACM V8 for new maintainers. It covers the end-to-end data flow, the role of every module, configuration surfaces, and the reasoning behind each major decision so that a new engineer can operate, extend, and hand off the system confidently.
+--------------+    +-----------------+    +-------------------+    +--------------------+
| Ingestion    |    | Feature Builder |    | Detector Heads    |    | Fusion & Episodes  |
| (CSV / SQL)  | -> | (fast_features) | -> | (PCA/MHAL/IF/GMM/ | -> |                    |
| acm_main     |    |                 |    |  AR1/OMR/etc.)    |    | (fuse)             |
+--------------+    +-----------------+    +-------------------+    +--------------------+
       |                     |                       |                        |
       v                     v                       v                        v
+--------------+    +-----------------+    +-------------------+    +--------------------+
| Regimes      |    | Calibration     |    | Drift             |    | Outputs & SQL      |
| (regimes)    |    | (z-scores, per- |    | (cusum)           |    | (OutputManager)    |
+--------------+    | regime/adaptive)|    +-------------------+    +--------------------+
       \            +-----------------+          |
        \-> Forecast/RUL (forecasting, rul_*) <-/
artifacts/{EQUIP}/run_<ts>/ per-run tables/charts + artifacts/{EQUIP}/models/ for cached detectors and forecast/regime state.
CLI: python -m core.acm_main --equip <EQUIP> [--train-csv ... --score-csv ...] [--config ...] [--clear-cache] [--log-level ...] [--disable-sql-logging]
SQL batch automation: python scripts/sql_batch_runner.py --equip FD_FAN [--resume --max-workers N --tick-minutes 240]
Uses SQL historian tables and calls usp_ACM_StartRun/usp_ACM_FinalizeRun. Handles cold-start retries and progress tracking (.sql_batch_progress.json).
File mode helper: powershell ./scripts/run/run_file_mode.ps1 (wraps acm_main with CSV defaults).
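When the CLI has to be driven from another Python process (a scheduler or a test harness, for example), the documented flags can be assembled programmatically. The following is only a sketch: `build_acm_command` is a hypothetical helper, not part of the ACM codebase, and it uses only the flags listed above.

```python
import shlex
import sys
from typing import List, Optional


def build_acm_command(
    equip: str,
    train_csv: Optional[str] = None,
    score_csv: Optional[str] = None,
    config: Optional[str] = None,
    clear_cache: bool = False,
    log_level: Optional[str] = None,
    disable_sql_logging: bool = False,
) -> List[str]:
    """Assemble an argv list for `python -m core.acm_main` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "core.acm_main", "--equip", equip]
    if train_csv:
        cmd += ["--train-csv", train_csv]
    if score_csv:
        cmd += ["--score-csv", score_csv]
    if config:
        cmd += ["--config", config]
    if clear_cache:
        cmd.append("--clear-cache")
    if log_level:
        cmd += ["--log-level", log_level]
    if disable_sql_logging:
        cmd.append("--disable-sql-logging")
    return cmd


# Show the command line that would be executed for a file-mode run.
command = build_acm_command(
    "FD_FAN",
    train_csv="data/FD_FAN_BASELINE_DATA.csv",
    score_csv="data/FD_FAN_BATCH_DATA.csv",
    log_level="INFO",
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root, so that a failed run raises instead of being silently ignored.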
Mode decision:
ACM_FORCE_FILE_MODE=1.
--equip selects the config row (SQL ACM_Config or configs/config_table.csv fallback).
ACM_BATCH_MODE=1 toggles batch-run semantics (continuous learning hooks).
(configs/config_table.csv or SQL ACM_Config)
Key paths (ParamPath) and reasoning:
data.train_csv, data.score_csv, data_dir, timestamp_col, tag_columns, sampling_secs, max_rows: define ingestion sources and schema expectations.
features.window, features.fft_bands, features.top_k_tags, features.fs_hz: control rolling window size and spectral bins; the window must match process dynamics to capture oscillations without oversmoothing.
models.* (pca, ar1, iforest, gmm, omr): detector hyperparameters; tighter contamination lowers false positives, higher pca.n_components trades explainability vs. residual sensitivity.
thresholds.*: fused/detector z clipping, quantiles for calibration.
fusion.*: detector weights and auto-tuning knobs (episode separability-based).
regimes.*: auto-k bounds, smoothing, transient detection, health thresholds; smaller k_min/k_max avoids over-segmentation on short baselines.
drift.*: CUSUM thresholds and drift aggregation.
runtime.*: version tag, heartbeat, reuse_model_fit, baseline buffer, phases toggles.
output.*: dual_mode (file + SQL), enable_forecast, enable_enhanced_forecast, destinations.
configs/sql_connection.ini (or .example.ini) supplies DSN/user/pass; SQLClient.from_ini loads it.
--config loads a YAML overrides file merged atop the config table row.
--train-csv / --score-csv override ingestion paths per run.
--clear-cache forces detector refit, ignoring cached joblib.
--log-level, --log-format, --log-module-level, --log-file, --disable-sql-logging.
core/acm_main.py
_get_equipment_id, _load_config, _compute_config_signature, _ensure_local_index, _sql_start_run/_sql_finalize_run, _calculate_adaptive_thresholds, _compute_drift_trend, _compute_regime_volatility.
OutputManager.load_data; SQL via SmartColdstart.load_with_retry.
Dedup timestamps, enforce monotonicity, guard empty SCORE (NOOP).
ACM_BaselineBuffer, local baseline_buffer.csv, or SCORE head when baseline thin.
(tables/data_quality.csv or SQL ACM_DataQuality), cadence checks.
fast_features.compute_basic_features with Polars fast-path; uses TRAIN medians to impute SCORE (prevents leakage).
fuse.Fuser.fuse combines z streams under configured weights; auto-tunes weights via episode separability (tune_detector_weights).
fuse.Fuser.detect_episodes finds sustained excursions with hysteresis; writes culprits via episode_culprits_writer.
(drift.compute), plots via drift.run.
forecasting.run_and_persist_enhanced_forecasting and rul_estimator/enhanced_rul_estimator.
OutputManager for CSV/PNG/SQL (scores, drift, events, regimes, PCA artifacts, health timelines, fusion quality, OMR contributions, etc.). Writes run stats, run metadata (run_metadata_writer), config history, culprits, caches models.
_sql_finalize_run; writes meta.json in file mode only.
core/fast_features.py
rolling_median, rolling_mad, rolling_mean_std, rolling_skew_kurt, rolling_ols_slope.
rolling_spectral_energy, rolling_xcorr, rolling_pairwise_lag, compute_basic_features(_pl).
(core/correlation.py):
MahalanobisDetector: ridge-regularized covariance with NaN audits; guards ill-conditioning and under-sampled cases.
PCASubspaceDetector: cleans non-finite, drops constants, scales, fits PCA; returns SPE (Q) and T2; handles low-sample fallback.
(core/outliers.py):
IsolationForestDetector: fits/uses scikit-learn IsolationForest, stores columns, optional quantile threshold when contamination is numeric.
GMMDetector: BIC-driven component selection, variance guards, scaling; returns negative log-likelihood style scores.
(core/omr.py and core/omr_new.py):
omr_new adds auto model selection, diagnostics, per-sensor contribution extraction, z clipping, min sample guards.
(core/forecasting.py, core/rul_estimator.py, core/enhanced_rul_estimator.py):
(core/drift.py): CUSUMDetector with z calibration; report plot generator.
(core/regimes.py): feature basis builder, auto-k (silhouette/Calinski-Harabasz), smoothing, transient detection via ROC energy, health labeling, persistence to joblib/json, loading with version guard.
(core/fuse.py): Fuser.fuse (weighted sum with z clipping), detect_episodes (hysteresis thresholds), ScoreCalibrator, tune_detector_weights (PR-AUC against episode windows).
artifacts/{equip}/models/vN, manifest generation, SQL dual-write hooks. Forecast state save/load helpers.
SQLBatchWriter for efficient inserts.
scripts/sql_batch_runner.py: continuous SQL processing with ticked windows, resume, cold-start retries, historian coverage checks.
check_*, validate_*, monitor_*, analyze_* for dashboards, data gaps, drift, forecast status, table population.
scripts/sql/*: SQL schema helpers and tests.
data/ for CSV baselines/batches; SQL mode pulls from historian tables configured per equipment.
artifacts/{EQUIP}/run_<timestamp>/ with scores.csv, drift.csv, episodes.csv, tables/*.csv, charts/*.png, meta.json (file mode).
artifacts/{EQUIP}/models/ containing detectors.joblib, regime model joblib/json, forecast state, baseline buffer.
grafana_dashboards/ holds JSON panels and docs for visualization alignment.
python -m venv .venv && .\.venv\Scripts\activate (Windows).
pip install -r requirements.txt (or pip install .).
configs/config_table.csv (or SQL ACM_Config). Ensure data.train_csv and data.score_csv exist for file mode.
configs/sql_connection.ini with DSN/credentials and historian table names.
python -m core.acm_main --equip FD_FAN ^
--train-csv data/FD_FAN_BASELINE_DATA.csv ^
--score-csv data/FD_FAN_BATCH_DATA.csv ^
--log-level INFO
python -m core.acm_main --equip FD_FAN --log-level INFO
# Uses SQL config row, historian, and SQL sinks. Disable SQL log sink with --disable-sql-logging.
python scripts/sql_batch_runner.py --equip FD_FAN GAS_TURBINE --max-workers 2 --tick-minutes 240 --resume
artifacts/FD_FAN/run_<ts>/tables/ and charts/.
(features.window, features.fs_hz).
main (CLI), _sql_start_run, _sql_finalize_run, _calculate_adaptive_thresholds, _compute_drift_trend, _compute_regime_volatility, _load_config, _compute_config_signature, _nearest_indexer, _ensure_local_index.
compute_basic_features, Polars fast-path helpers.
MahalanobisDetector.fit/score, PCASubspaceDetector.fit/score.
IsolationForestDetector.fit/score/predict, GMMDetector.fit/score.
OMRDetector.fit/score/get_top_contributors/get_diagnostics, OMRModel.to_dict.
Fuser.fuse/detect_episodes, ScoreCalibrator.fit/transform, tune_detector_weights.
build_feature_basis, build_regime_model, apply_regime_labels, detect_transient_states, save_regime_model, load_regime_model.
CUSUMDetector.fit/score, compute, run.
AR1Detector, estimate_rul, run_and_persist_enhanced_forecasting, run_enhanced_forecasting_sql, should_retrain, compute_data_hash.
_simple_ar1_forecast, estimate_rul_and_failure, compute_rul_multipath, degradation model classes, attribution and maintenance recommendations.
AdaptiveThresholdCalculator.calculate_fused_threshold/_calculate_per_regime, calculate_warn_threshold.
create_output_manager, write_scores_ts, write_drift_ts, write_anomaly_events, write_regime_episodes, write_pca_model/loadings/metrics, analytics generators (_generate_*), flush/close.
ModelVersionManager.get_*, save_models/load_models, save_forecast_state/load_forecast_state, manifest helpers.
write_run_metadata, extract_run_metadata_from_scores, extract_data_quality_score, write_retrain_metadata.
write_config_change(s), log_auto_tune_changes.
compute_detector_contributions, write_episode_culprits_enhanced.
SmartColdstart.load_with_retry, cadence detection, progress tracking.
SQLClient, SqlLogSink, SQLPerformanceMonitor, SQLBatchWriter, sql_protocol mocks.
SQLBatchRunner orchestrates continuous historian processing.
--equip pointed at a known equipment row; verify ACM_Runs, Scores_Wide, Episodes populate.
artifacts/{EQUIP}/run_<ts>/meta.json (file mode) or ACM_Runs (SQL) to confirm health indices and thresholds.
config_history.log to ensure auto-tuning events are recorded.
grafana_dashboards/ JSONs and docs/ quick refs (e.g., BATCH_MODE_SQL_QUICK_REF.md, SQL_MODE_CONFIGURATION.md).
scores.csv or zero rows written: check data guardrails, duplicate timestamp removal, historian coverage (check_historian_data.py, _log_historian_overview in sql_batch_runner).
--clear-cache.
thresholds.clip_z, enable per-regime thresholds (DET-07), or lower fusion weights for noisy heads via config or auto-tuning.
enable_sql_sink not disabled, SQL connectivity via OutputManager diagnostic, table existence (check_tables_existence.py).
(features.top_k_tags), or limit max models in regimes auto-k.
configs/config_table.csv or SQL ACM_Config; keep config_signature changes in mind for cache invalidation.
OutputManager.ALLOWED_TABLES and add write helpers + analytics generators to keep file/SQL parity.
REGIME_MODEL_VERSION to invalidate stale caches.
Console with module-level overrides (--log-module-level core.fast_features=DEBUG).
File mode row (config_table.csv)
EquipID,Category,ParamPath,ParamValue,ValueType
0,data,train_csv,data/FD_FAN_BASELINE_DATA.csv,string
0,data,score_csv,data/FD_FAN_BATCH_DATA.csv,string
0,data,timestamp_col,Timestamp,string
0,data,sampling_secs,60,int
0,features,window,16,int
0,models,pca.n_components,5,int
0,models,iforest.contamination,0.001,float
0,fusion,weights,"{""pca_spe_z"":0.25,""pca_t2_z"":0.25,""iforest_z"":0.2,""gmm_z"":0.15,""omr_z"":0.15}",json
0,output,dual_mode,false,bool
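To make the row layout above concrete, here is a minimal sketch of how such rows could be folded into a nested dictionary using the ValueType column for casting. It is not the actual `_load_config` implementation: `load_config_rows`, `CASTERS`, and the rule that a dotted ParamPath nests under its Category are assumptions made for illustration.

```python
import csv
import io
import json
from typing import Any, Dict

# A shortened copy of the file-mode rows, embedded so the sketch runs standalone.
SAMPLE_ROWS = '''EquipID,Category,ParamPath,ParamValue,ValueType
0,data,train_csv,data/FD_FAN_BASELINE_DATA.csv,string
0,data,sampling_secs,60,int
0,models,pca.n_components,5,int
0,models,iforest.contamination,0.001,float
0,fusion,weights,"{""pca_spe_z"":0.25,""iforest_z"":0.2}",json
0,output,dual_mode,false,bool
'''

# Hypothetical mapping from ValueType to a casting function.
CASTERS = {
    "string": str,
    "int": int,
    "float": float,
    "bool": lambda s: s.strip().lower() in ("true", "1", "yes"),
    "json": json.loads,
}


def load_config_rows(text: str, equip_id: int) -> Dict[str, Any]:
    """Build {category: {nested ParamPath: typed value}} for one equipment."""
    cfg: Dict[str, Any] = {}
    for row in csv.DictReader(io.StringIO(text)):
        if int(row["EquipID"]) != equip_id:
            continue
        value = CASTERS[row["ValueType"]](row["ParamValue"])
        node = cfg.setdefault(row["Category"], {})
        parts = row["ParamPath"].split(".")
        for key in parts[:-1]:  # walk/create intermediate levels, e.g. "pca"
            node = node.setdefault(key, {})
        node[parts[-1]] = value
    return cfg


config = load_config_rows(SAMPLE_ROWS, equip_id=0)
print(json.dumps(config, indent=2))
```

Casting at load time means a malformed value (for example `sampling_secs` set to `sixty`) fails immediately with a `ValueError` rather than surfacing later inside a detector.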
SQL mode row (config_table.csv)
EquipID,Category,ParamPath,ParamValue,ValueType
1,data,storage_backend,sql,string
1,data,timestamp_col,EntryDateTime,string
1,data,sampling_secs,60,int
1,models,pca.incremental,true,bool
1,runtime,reuse_model_fit,false,bool
1,output,dual_mode,true,bool
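Per-run overrides supplied with --config are described as being merged atop the config table row. One common way to do that is a recursive dictionary merge, sketched below; `deep_merge` is a hypothetical helper, and the actual merge semantics in acm_main may differ (for example in how lists are handled). In practice the override dictionary would come from parsing the YAML file; plain dictionaries are used here to keep the sketch dependency-free.

```python
import copy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where values in `override` win, merging nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and type mismatches replace the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Values taken from the SQL-mode row above.
row_config = {
    "data": {"storage_backend": "sql", "timestamp_col": "EntryDateTime", "sampling_secs": 60},
    "runtime": {"reuse_model_fit": False},
    "output": {"dual_mode": True},
}

# Illustrative override content; the keys are examples only.
overrides = {"data": {"sampling_secs": 30}, "output": {"dual_mode": False}}

effective = deep_merge(row_config, overrides)
print(effective["data"])
```

Because the function copies rather than mutates, the original row stays available for logging the difference between the stored and the effective configuration.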
SQL connection (configs/sql_connection.ini)
[sqlserver]
driver=ODBC Driver 17 for SQL Server
server=YOUR_SQL_SERVER_HOST
database=ACM
uid=acm_user
pwd=***REDACTED***
trust_certificate=yes
timeout=30
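For debugging connectivity outside ACM, the INI section above can be read with the standard library and turned into an ODBC connection string. This is a sketch only and not how `SQLClient.from_ini` is implemented: `build_connection_string` is a hypothetical helper, and mapping `trust_certificate` to the driver keyword `TrustServerCertificate` is an assumption.

```python
import configparser
from typing import Tuple

# Placeholder values mirroring the layout above; never embed real credentials.
SAMPLE_INI = """
[sqlserver]
driver=ODBC Driver 17 for SQL Server
server=YOUR_SQL_SERVER_HOST
database=ACM
uid=acm_user
pwd=example-password
trust_certificate=yes
timeout=30
"""


def build_connection_string(ini_text: str, section: str = "sqlserver") -> Tuple[str, int]:
    """Return (ODBC connection string, timeout in seconds) from INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    sec = parser[section]
    parts = [
        f"DRIVER={{{sec['driver']}}}",  # ODBC expects the driver name in braces
        f"SERVER={sec['server']}",
        f"DATABASE={sec['database']}",
        f"UID={sec['uid']}",
        f"PWD={sec['pwd']}",
    ]
    # Assumed mapping of the INI flag onto the driver keyword.
    if sec.getboolean("trust_certificate", fallback=False):
        parts.append("TrustServerCertificate=yes")
    return ";".join(parts), sec.getint("timeout", fallback=30)


conn_str, timeout_secs = build_connection_string(SAMPLE_INI)
# Avoid printing the password when logging the result.
print(conn_str.replace("PWD=example-password", "PWD=***"), timeout_secs)
```

With a real file, `parser.read("configs/sql_connection.ini")` would replace `read_string`, and the pair could be handed to an ODBC client such as pyodbc (`pyodbc.connect(conn_str, timeout=timeout_secs)`).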
acm_main.timestamp_col config (defaults to Timestamp/EntryDateTime); historian uses EntryDateTime.
data.sampling_secs; cadence check warns when drifted; SmartColdstart enforces window discovery.
(features.window) should reflect process dynamics (too small = noisy, too large = lag).
thresholds.clip_z (e.g., 8–10) to avoid extreme leverage.
python -m core.acm_main --equip FD_FAN --train-csv data/FD_FAN_BASELINE_DATA.csv --score-csv data/FD_FAN_BATCH_DATA.csv --log-level INFO
artifacts/FD_FAN/run_<ts>/scores.csv, episodes.csv, meta.json exist; inspect fused z distribution and episode count.
_log_historian_overview (runs inside sql_batch_runner); or run scripts/check_historian_data.py.
python -m core.acm_main --equip FD_FAN --log-level INFO (SQL mode default). Check ACM_Runs row inserted and Scores_Wide populated.
--clear-cache and compare PCA/IFOREST model hashes in detectors.joblib to ensure cache invalidation.
_compute_drift_trend; spikes may warrant refit.
--clear-cache or expand baseline window.
(check_tables_existence.py), and connectivity.
artifacts/{equip}/models/refit_requested.flag.
_compute_config_signature hashes model/feature/fusion/regime/threshold sections; cache reuse only when hash matches.
REGIME_MODEL_VERSION (regimes.py) gates loading; bump when changing clustering logic or feature basis.
artifacts/{equip}/models; retrain when data hash changes.
OutputManager.ALLOWED_TABLES, add write helpers, and align SQL schemas before deployment.
configs/sql_connection.ini out of VCS; use .example.ini as template. Do not commit real passwords/DSNs.
SqlLogSink.
SqlLogSink is enabled only where allowed.
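The cache rule noted above (a hash over the model, feature, fusion, regime, and threshold sections, with cache reuse only on a match) can be illustrated as follows. This is a sketch under assumptions, not the real `_compute_config_signature`: the section names, the canonical JSON serialization, and the choice of SHA-256 are all illustrative.

```python
import hashlib
import json
from typing import Any, Dict, Iterable

# Assumed names of the config sections that influence fitted models.
SIGNATURE_SECTIONS = ("models", "features", "fusion", "regimes", "thresholds")


def compute_config_signature(
    cfg: Dict[str, Any], sections: Iterable[str] = SIGNATURE_SECTIONS
) -> str:
    """Hash only the sections that affect model fitting, in a stable order."""
    subset = {name: cfg.get(name, {}) for name in sections}
    # sort_keys makes the digest independent of dict insertion order.
    payload = json.dumps(subset, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def can_reuse_cache(cfg: Dict[str, Any], cached_signature: str) -> bool:
    """Cached detectors are reusable only when the signature is unchanged."""
    return compute_config_signature(cfg) == cached_signature


baseline = {
    "features": {"window": 16},
    "models": {"pca": {"n_components": 5}},
    "runtime": {"heartbeat": True},
}
stored = compute_config_signature(baseline)

# A runtime-only change keeps the cache valid; a feature change does not.
runtime_change = {**baseline, "runtime": {"heartbeat": False}}
feature_change = {**baseline, "features": {"window": 32}}
print(can_reuse_cache(runtime_change, stored), can_reuse_cache(feature_change, stored))
```

Restricting the hash to fit-relevant sections avoids needless refits when only logging or output settings change, at the cost of having to extend the section list whenever a new setting starts influencing model fitting.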